Structure Index for RDF Data
Authors
Abstract
In recent years, the amount of structured RDF data available on the Web has been increasing rapidly. Efficient query processing that can scale to large amounts of RDF data has become an important topic, and significant efforts have been dedicated to the development of solutions for RDF data management. Along this line of research, we elaborate on a novel data partitioning strategy that leverages the structure of the underlying data. This structure is represented in the form of PIG, a parameterized structure index we propose for (RDF) data graphs. PIG is not only used for data partitioning but has also been designed to accelerate the matching of graph-structured queries against RDF data. In our benchmark against state-of-the-art techniques, our structure-based approach to partitioning and query processing is 7 to 8 times faster.
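To make the idea of a structure index concrete, here is a minimal sketch in Python. It is not the PIG construction, which this abstract does not describe. As a deliberately simpler stand-in, it assumes that nodes are grouped into blocks by the set of predicates on their outgoing edges, that each triple is partitioned by its subject's block, and that a query needing certain predicates only inspects blocks whose signature covers them. The toy triples and all function names are hypothetical.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object); all names are hypothetical.
TRIPLES = [
    ("alice", "worksAt", "acme"),
    ("alice", "knows", "bob"),
    ("bob", "worksAt", "acme"),
    ("bob", "knows", "alice"),
    ("carol", "knows", "alice"),
    ("acme", "locatedIn", "berlin"),
]


def build_structure_index(triples):
    """Group subjects by the set of predicates on their outgoing edges.

    Assumption: this one-step signature stands in for a real structure
    index; it is not the parameterized index proposed in the paper.
    """
    out_preds = defaultdict(set)
    for s, p, _ in triples:
        out_preds[s].add(p)
    blocks = defaultdict(set)
    for node, preds in out_preds.items():
        blocks[frozenset(preds)].add(node)
    return dict(blocks)


def partition_triples(triples, blocks):
    """Place each triple in the partition of its subject's block."""
    block_of = {n: sig for sig, nodes in blocks.items() for n in nodes}
    parts = defaultdict(list)
    for t in triples:
        parts[block_of[t[0]]].append(t)
    return dict(parts)


def candidate_subjects(blocks, required_preds):
    """Return subjects whose block signature covers all required predicates."""
    required = set(required_preds)
    result = set()
    for sig, nodes in blocks.items():
        if required <= sig:
            result |= nodes
    return result


blocks = build_structure_index(TRIPLES)
parts = partition_triples(TRIPLES, blocks)
matches = candidate_subjects(blocks, ["worksAt", "knows"])
```

In this sketch, a query asking for subjects that have both a `worksAt` and a `knows` edge is first matched against the small set of block signatures, so only the partition holding `alice` and `bob` needs to be read; the partitions for `carol` and `acme` are skipped.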
Similar resources
HPRD: A High Performance RDF Database
In this paper, a high-performance storage system for RDF documents is introduced. The system employs optimized index structures for RDF data and efficient RDF query evaluation. The index scheme consists of three types of indices. The triple index manages basic RDF triples by dividing the original RDF graph into several sub-graphs. The path index manages frequent RDF path patterns for long path query performance...
An Extensible Framework for Query Optimization on TripleT-based RDF Stores
The RDF data model is a key technology in the Linked Data vision. Given its graph structure, even relatively simple RDF queries often involve a large number of joins. Join evaluation poses a significant performance challenge on all state-of-the-art RDF engines. TripleT is a novel RDF index data structure, demonstrated to be competitive with the current state-of-the-art for join processing. Quer...
A Distributed Process Infrastructure for a Distributed Data Structure
The Resource Description Framework (RDF) is continuing to grow outside the bounds of its initial function as a metadata framework and into the domain of general-purpose data modeling. This expansion has been facilitated by the continued increase in the capacity and speed of RDF database repositories known as triple-stores. High-end RDF triple-stores can hold and process on the order of 10 billi...
Towards Efficient SPARQL Query Processing on RDF Data
Efficient support for querying large-scale RDF triples plays an important role in Semantic Web data management. This paper proposes an efficient RDF query engine to evaluate SPARQL queries, where the inverted index structure is employed for indexing RDF triples. We first design and implement a set of operators on the inverted index for query optimization and evaluation. Then we propose a main-t...
Using an index of precomputed joins in order to speed up SPARQL processing
SPARQL is a query language developed by the W3C for querying data sets in RDF, which represent directed graphs. Many freely available or commercial products already support SPARQL processing. Current index-based optimizations integrated in these products typically construct indices on the subject, predicate and object of an RDF triple, which is a single datum of the RDF data, i...